2 research outputs found

    Cloud-based textual analysis as a basis for document classification

    Growing trends in data mining and developments in machine learning have encouraged interest in analytical techniques that can contribute insights on data characteristics. The present paper describes an approach to textual analysis that generates extensive quantitative data on target documents, with output including frequency data on tokens, types, parts-of-speech and word n-grams. These analytical results enrich the available source data and have proven useful in several contexts as a basis for automating manual classification tasks. In the following, we introduce the Posit textual analysis toolset and detail its use in data enrichment as input to supervised learning tasks, including automating the identification of extremist Web content. Next, we describe the extension of this approach to the Arabic language. Thereafter, we recount the move of these analytical facilities from local operation to a Cloud-based service. This transition affords easy remote access for other researchers seeking to explore the application of such data enrichment to their own text-based data sets.

    Learning-based approaches for reconstructions with inexact operators in nanoCT applications

    Imaging problems such as the one in nanoCT require the solution of an inverse problem, where it is often taken for granted that the forward operator, i.e., the underlying physical model, is properly known. In the present work we address the problem where the forward model is inexact due to stochastic or deterministic deviations during the measurement process. In particular, we investigate the performance of non-learned iterative reconstruction methods that deal with inexactness and of learned reconstruction schemes based on U-Nets and conditional invertible neural networks. The latter also provide the opportunity for uncertainty quantification. A large synthetic data set in line with a typical nanoCT setting is provided, and extensive numerical experiments are conducted to evaluate the proposed methods.